tractable probabilistic knowledge base
Tractable Probabilistic Knowledge Bases: Wikipedia and Beyond
Niepert, Mathias (University of Washington) | Domingos, Pedro (University of Washington)
Building large-scale knowledge bases from a variety of data sources is a longstanding goal of AI research. However, existing approaches either ignore the uncertainty inherent to knowledge extracted from text, the web, and other sources, or lack a consistent probabilistic semantics with tractable inference. To address this problem, we present a framework for tractable probabilistic knowledge bases (TPKBs). TPKBs consist of a hierarchy of classes of objects and a hierarchy of classes of object pairs such that attributes and relations are independent conditioned on those classes. These characteristics facilitate both tractable probabilistic reasoning and tractable maximum-likelihood parameter learning. TPKBs feature a rich query language that allows one to express and infer complex relationships between classes, relations, objects, and their attributes. The queries are translated to sequences of operations in a relational database facilitating query execution times in the sub-second range. We demonstrate the power of TPKBs by leveraging large data sets extracted from Wikipedia to learn their structure and parameters. The resulting TPKB models a distribution over millions of objects and billions of parameters. We apply the TPKB to entity resolution and object linking problems and show that the TPKB can accurately align large knowledge bases and integrate triples from open IE projects.
Tractable Probabilistic Knowledge Bases with Existence Uncertainty
Webb, W. Austin (University of Washington) | Domingos, Pedro (University of Washington)
A central goal of AI is to reason efficiently in domains that are both complex and uncertain. Most attempts toward this end add probability to a tractable subset of first-order logic, but this results in intractable inference. To address this, Domingos and Webb (2012) introduced tractable Markov logic (TML), the first tractable first-order probabilistic representation. Despite its surprising expressiveness, TML has a number ofsignificant limitations. Chief among these is that it does not explicitly handle existence uncertainty, meaning that all possible worlds contain the same objects and relations. This leads to a number of conceptual problems, such as models that must contain meaningless combinations of attributes (e.g.,horses with wheels). Here we propose a new formalism, tractable probabilistic knowledge bases (TPKBs), that overcomes this problem. Like TML, TPKBs use probabilistic class and part hierarchies to ensure tractability, but TPKBs have a much cleaner and user-friendly object-oriented syntax and a well-founded semantics for existence uncertainty. TML is greatly complicated by the use of probabilistic theorem proving, an inference procedure that is much more powerful than necessary. In contrast, we introduce an inference procedure specifically designed for TPKBs, which makes them far more transparent and amenable to analysis and implementation. TPKBs subsume TML and therefore essentially all tractable models, including many high-treewidth ones.